{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# TensorFlow Tutorial\n",
    "\n",
    "欢迎来到本周的编程任务。到目前为止，你一直使用numpy来建立神经网络。现在我们将引导您深入学习框架，让您更容易地建立神经网络。TensorFlow，PaddlePaddle，Torch，Caffe，Keras等机器学习框架可以显着加速您的机器学习开发。这些框架有很多文档，你可以随意阅读。在这个任务中，您将学习如何在TensorFlow中执行以下操作：\n",
    "\n",
    "- 初始化变量\n",
    "- 开始你自己的会话\n",
    "- 训练算法\n",
    "- 实现一个神经网络\n",
    "\n",
    "编程框架不仅可以缩短您的编码时间，但有时也可以执行优化来加速您的代码。\n",
    "\n",
    "\n",
    "## 1 - Exploring the Tensorflow Library\n",
    "\n",
    "首先导入相关库\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import math\n",
    "import numpy as np\n",
    "import h5py\n",
    "import matplotlib.pyplot as plt\n",
    "import tensorflow as tf\n",
    "from tensorflow.python.framework import ops\n",
    "from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict\n",
    "\n",
    "%matplotlib inline\n",
    "np.random.seed(1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们将从一个例子开始，这个例子里需要计算一个训练样例的损失。\n",
    "$$loss = \\mathcal{L}(\\hat{y}, y) = (\\hat y^{(i)} - y^{(i)})^2 \\tag{1}$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "9\n"
     ]
    }
   ],
   "source": [
    "y_hat = tf.constant(36, name='y_hat')            # Define y_hat constant. Set to 36.\n",
    "y = tf.constant(39, name='y')                    # Define y. Set to 39\n",
    "\n",
    "loss = tf.Variable((y - y_hat)**2, name='loss')  # Create a variable for the loss\n",
    "\n",
    "init = tf.global_variables_initializer()         # When init is run later (session.run(init)),\n",
    "                                                 # the loss variable will be initialized and ready to be computed\n",
    "with tf.Session() as session:                    # Create a session and print the output\n",
    "    session.run(init)                            # Initializes the variables\n",
    "    print(session.run(loss))                     # Prints the loss"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "用TensorFlow编写和运行程序通常有以下步骤：\n",
    "\n",
    "1. 创建尚未执行/评估的张量（变量）\n",
    "2. 在这些张量之间写入操作。\n",
    "3. 初始化您的张量。\n",
    "4. 创建一个会话。\n",
    "5. 运行会话。这将运行你上面已经完成的写操作。 \n",
    "\n",
    "Therefore, when we created a variable for the loss, we simply defined the loss as a function of other quantities, but did not evaluate its value. To evaluate it, we had to run `init=tf.global_variables_initializer()`. That initialized the loss variable, and in the last line we were finally able to evaluate the value of `loss` and print its value.\n",
    "\n",
    "现在让我们看一个简单的例子。运行下面的单元格："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tensor(\"Mul:0\", shape=(), dtype=int32)\n"
     ]
    }
   ],
   "source": [
    "a = tf.constant(2)\n",
    "b = tf.constant(10)\n",
    "c = tf.multiply(a,b)\n",
    "print(c)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "正如所料，你不会看到20！你得到是一个类型为“int32”且没有shape属性的张量。你目前所做的只是把它放到“计算图”中，你还没有运行这个计算。为了让这两个数实际上相乘，你需要去创建一个会话并运行它."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "20\n"
     ]
    }
   ],
   "source": [
    "sess = tf.Session()\n",
    "print(sess.run(c))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Great! 总结一下, **，记得初始化你的变量，创建一个会话并运行会话中的操作**. \n",
    "\n",
    "接下来，您还必须了解占位符.占位符是一个您可以稍后指定值的对象. \n",
    "要为占位符指定值，可以使用 \"feed dictionary\" (`feed_dict` variable)传值。 下面，我们为x创建了一个占位符。这允许我们稍后在运行会话时传入一个数字。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "6\n"
     ]
    }
   ],
   "source": [
    "# Change the value of x in the feed_dict\n",
    "\n",
    "x = tf.placeholder(tf.int64, name = 'x')\n",
    "print(sess.run(2 * x, feed_dict = {x: 3}))\n",
    "sess.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "当你第一次定义x你不必为它指定一个值。占位符只是一个变量，您将在稍后运行会话时将数据分配给该变量。\n",
    "\n",
    "以下是发生了的事情：当您指定计算所需的操作时，您正在告诉TensorFlow如何构建计算图。计算图可以有一些占位符，其值将在稍后指定。最后，当你运行会话时，你正在告诉TensorFlow执行计算图。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.1 - Linear function\n",
    "\n",
    "让我们通过计算下面的等式来开始这个编程练习: $Y = WX + b$, where $W$ and $X$ are random matrices and b is a random vector. \n",
    "\n",
    "**Exercise**: 计算 $WX + b$， $W, X$, 和 $b$ 是从随机正太分布中抽取的. W is of shape (4, 3), X is (3,1) and b is (4,1). 下面是一个小例子，告诉你如何定义一个具有shape（3,1）的常量X:\n",
    "```python\n",
    "X = tf.constant(np.random.randn(3,1), name = \"X\")\n",
    "\n",
    "```\n",
    "下面的方法可能对你有用: \n",
    "- tf.matmul(..., ...) to do a matrix multiplication\n",
    "- tf.add(..., ...) to do an addition\n",
    "- np.random.randn(...) to initialize randomly\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: linear_function\n",
    "\n",
    "def linear_function():\n",
    "    \"\"\"\n",
    "    Implements a linear function: \n",
    "            Initializes W to be a random tensor of shape (4,3)\n",
    "            Initializes X to be a random tensor of shape (3,1)\n",
    "            Initializes b to be a random tensor of shape (4,1)\n",
    "    Returns: \n",
    "    result -- runs the session for Y = WX + b \n",
    "    \"\"\"\n",
    "    \n",
    "    np.random.seed(1)\n",
    "    \n",
    "    ### START CODE HERE ### (4 lines of code)\n",
    "    X = tf.constant(np.random.randn(3,1), name = \"X\")\n",
    "    W = tf.constant(np.random.randn(4,3), name = \"W\")\n",
    "    b = tf.constant(np.random.randn(4,1), name = \"b\")\n",
    "#     print(X)\n",
    "#     print(W)\n",
    "#     print(b)\n",
    "    Y = tf.add(tf.matmul(W,X), b)\n",
    "    ### END CODE HERE ### \n",
    "    \n",
    "    # Create the session using tf.Session() and run it with sess.run(...) on the variable you want to calculate\n",
    "    \n",
    "    ### START CODE HERE ###\n",
    "    sess = tf.Session()\n",
    "    result = sess.run(Y)\n",
    "    ### END CODE HERE ### \n",
    "    \n",
    "    # close the session \n",
    "    sess.close()\n",
    "\n",
    "    return result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "result = [[-2.15657382]\n",
      " [ 2.95891446]\n",
      " [-1.08926781]\n",
      " [-0.84538042]]\n"
     ]
    }
   ],
   "source": [
    "print( \"result = \" + str(linear_function()))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*** Expected Output ***: \n",
    "\n",
    "<table> \n",
    "<tr> \n",
    "<td>\n",
    "**result**\n",
    "</td>\n",
    "<td>\n",
    "[[-2.15657382]\n",
    " [ 2.95891446]\n",
    " [-1.08926781]\n",
    " [-0.84538042]]\n",
    "</td>\n",
    "</tr> \n",
    "\n",
    "</table> "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.2 - Computing the sigmoid \n",
    "Great! 你刚刚实现了一个线性函数. Tensorflow 提供了很多类似 `tf.sigmoid` 和 `tf.softmax`这样的常用的神经网络功能. For this exercise lets compute the sigmoid function of an input. \n",
    "\n",
    "你将使用占位符变量`x`来做这个练习. When running the session, you should use the feed dictionary to pass in the input `z`. In this exercise, you will have to (i) create a placeholder `x`, (ii) define the operations needed to compute the sigmoid using `tf.sigmoid`, and then (iii) run the session. \n",
    "\n",
    "** Exercise **: 实现下面的 sigmoid 函数. You should use the following: \n",
    "\n",
    "- `tf.placeholder(tf.float32, name = \"...\")`\n",
    "- `tf.sigmoid(...)`\n",
    "- `sess.run(..., feed_dict = {x: z})`\n",
    "\n",
    "请注意，有两种典型的方法来去创建和使用tensor流中的会话：\n",
    "\n",
    "**Method 1:**\n",
    "```python\n",
    "sess = tf.Session()\n",
    "# Run the variables initialization (if needed), run the operations\n",
    "result = sess.run(..., feed_dict = {...})\n",
    "sess.close() # Close the session\n",
    "```\n",
    "**Method 2:**\n",
    "```python\n",
    "with tf.Session() as sess: \n",
    "    # run the variables initialization (if needed), run the operations\n",
    "    result = sess.run(..., feed_dict = {...})\n",
    "    # This takes care of closing the session for you :)\n",
    "```\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: sigmoid\n",
    "\n",
    "def sigmoid(z):\n",
    "    \"\"\"\n",
    "    Computes the sigmoid of z\n",
    "    \n",
    "    Arguments:\n",
    "    z -- input value, scalar or vector\n",
    "    \n",
    "    Returns: \n",
    "    results -- the sigmoid of z\n",
    "    \"\"\"\n",
    "    \n",
    "    ### START CODE HERE ### ( approx. 4 lines of code)\n",
    "    # Create a placeholder for x. Name it 'x'.\n",
    "    x = tf.placeholder(tf.float32, name=\"x\")\n",
    "\n",
    "    # compute sigmoid(x)\n",
    "    sigmoid = tf.sigmoid(x)\n",
    "\n",
    "    # Create a session, and run it. Please use the method 2 explained above. \n",
    "    # You should use a feed_dict to pass z's value to x. \n",
    "    with tf.Session() as sess:\n",
    "        # Run session and call the output \"result\"\n",
    "        result = sess.run(sigmoid, feed_dict={x:z})\n",
    "    \n",
    "    ### END CODE HERE ###\n",
    "    \n",
    "    return result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "sigmoid(0) = 0.5\n",
      "sigmoid(12) = 0.999994\n"
     ]
    }
   ],
   "source": [
    "print (\"sigmoid(0) = \" + str(sigmoid(0)))\n",
    "print (\"sigmoid(12) = \" + str(sigmoid(12)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*** Expected Output ***: \n",
    "\n",
    "<table> \n",
    "<tr> \n",
    "<td>\n",
    "**sigmoid(0)**\n",
    "</td>\n",
    "<td>\n",
    "0.5\n",
    "</td>\n",
    "</tr>\n",
    "<tr> \n",
    "<td>\n",
    "**sigmoid(12)**\n",
    "</td>\n",
    "<td>\n",
    "0.999994\n",
    "</td>\n",
    "</tr> \n",
    "\n",
    "</table> "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<font color='blue'>\n",
    "**总结一下，你应该知道怎样去：**\n",
    "1. 创建占位符\n",
    "2. 指定与您要计算的操作对应的计算图\n",
    "3. 创建会话\n",
    "4. 运行会话，必要时使用Feed字典来指定占位符变量的值. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.3 -  Computing the Cost\n",
    "\n",
    "您也可以使用内置函数来计算您的神经网络的损失. So instead of needing to write code to compute this as a function of $a^{[2](i)}$ and $y^{(i)}$ for i=1...m: \n",
    "$$ J = - \\frac{1}{m}  \\sum_{i = 1}^m  \\large ( \\small y^{(i)} \\log a^{ [2] (i)} + (1-y^{(i)})\\log (1-a^{ [2] (i)} )\\large )\\small\\tag{2}$$\n",
    "\n",
    "在tensorflow中你可以用一行代码中做到这一点！\n",
    "\n",
    "**Exercise**: 实现交叉熵损失. The function you will use is: \n",
    "\n",
    "\n",
    "- `tf.nn.sigmoid_cross_entropy_with_logits(logits = ...,  labels = ...)`\n",
    "\n",
    "Your code should input `z`, compute the sigmoid (to get `a`) and then compute the cross entropy cost $J$. All this can be done using one call to `tf.nn.sigmoid_cross_entropy_with_logits`, which computes\n",
    "\n",
    "$$- \\frac{1}{m}  \\sum_{i = 1}^m  \\large ( \\small y^{(i)} \\log \\sigma(z^{[2](i)}) + (1-y^{(i)})\\log (1-\\sigma(z^{[2](i)})\\large )\\small\\tag{2}$$\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: cost\n",
    "\n",
    "def cost(logits, labels):\n",
    "    \"\"\"\n",
    "    Computes the cost using the sigmoid cross entropy\n",
    "    \n",
    "    Arguments:\n",
    "    logits -- vector containing z, output of the last linear unit (before the final sigmoid activation)\n",
    "    labels -- vector of labels y (1 or 0) \n",
    "    \n",
    "    Note: What we've been calling \"z\" and \"y\" in this class are respectively called \"logits\" and \"labels\" \n",
    "    in the TensorFlow documentation. So logits will feed into z, and labels into y. \n",
    "    \n",
    "    Returns:\n",
    "    cost -- runs the session of the cost (formula (2))\n",
    "    \"\"\"\n",
    "    \n",
    "    ### START CODE HERE ### \n",
    "    \n",
    "    # Create the placeholders for \"logits\" (z) and \"labels\" (y) (approx. 2 lines)\n",
    "    z = tf.placeholder(tf.float32, name=\"logits\")\n",
    "    y = tf.placeholder(tf.float32, name=\"labels\")\n",
    "    \n",
    "    # Use the loss function (approx. 1 line)\n",
    "    cost = tf.nn.sigmoid_cross_entropy_with_logits(logits=z, labels=y)\n",
    "    \n",
    "    # Create a session (approx. 1 line). See method 1 above.\n",
    "    sess = tf.Session()\n",
    "    \n",
    "    # Run the session (approx. 1 line).\n",
    "    cost = sess.run(cost, feed_dict={z:logits, y:labels})\n",
    "    \n",
    "    \n",
    "    # Close the session (approx. 1 line). See method 1 above.\n",
    "    sess.close()\n",
    "    \n",
    "    ### END CODE HERE ###\n",
    "    \n",
    "    return cost"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 0.54983395  0.59868765  0.66818774  0.71094948]\n"
     ]
    }
   ],
   "source": [
    "logits = sigmoid(np.array([0.2,0.4,0.7,0.9]))\n",
    "cost = cost(logits, np.array([0,0,1,1]))\n",
    "print(logits)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "** Expected Output** : \n",
    "\n",
    "<table> \n",
    "    <tr> \n",
    "        <td>\n",
    "            **cost**\n",
    "        </td>\n",
    "        <td>\n",
    "        [ 1.00538719  1.03664088  0.41385433  0.39956614]\n",
    "        </td>\n",
    "    </tr>\n",
    "\n",
    "</table>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.4 - Using One Hot encodings\n",
    "\n",
    "Many times in deep learning you will have a y vector with numbers ranging from 0 to C-1, where C is the number of classes. If C is for example 4, then you might have the following y vector which you will need to convert as follows:\n",
    "\n",
    "\n",
    "<img src=\"images/onehot.png\" style=\"width:600px;height:150px;\">\n",
    "\n",
    "This is called a \"one hot\" encoding, because in the converted representation exactly one element of each column is \"hot\" (meaning set to 1). To do this conversion in numpy, you might have to write a few lines of code. In tensorflow, you can use one line of code: \n",
    "\n",
    "- tf.one_hot(labels, depth, axis) \n",
    "\n",
    "**Exercise:** Implement the function below to take one vector of labels and the total number of classes $C$, and return the one hot encoding. Use `tf.one_hot()` to do this. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: one_hot_matrix\n",
    "\n",
    "def one_hot_matrix(labels, C):\n",
    "    \"\"\"\n",
    "    Creates a matrix where the i-th row corresponds to the ith class number and the jth column\n",
    "                     corresponds to the jth training example. So if example j had a label i. Then entry (i,j) \n",
    "                     will be 1. \n",
    "                     \n",
    "    Arguments:\n",
    "    labels -- vector containing the labels \n",
    "    C -- number of classes, the depth of the one hot dimension\n",
    "    \n",
    "    Returns: \n",
    "    one_hot -- one hot matrix\n",
    "    \"\"\"\n",
    "    \n",
    "    ### START CODE HERE ###\n",
    "    \n",
    "    # Create a tf.constant equal to C (depth), name it 'C'. (approx. 1 line)\n",
    "    C = tf.constant(C)\n",
    "    \n",
    "    # Use tf.one_hot, be careful with the axis (approx. 1 line)\n",
    "    one_hot_matrix = tf.one_hot(labels, C, axis=0)\n",
    "    \n",
    "    # Create the session (approx. 1 line)\n",
    "    sess = tf.Session()\n",
    "    \n",
    "    # Run the session (approx. 1 line)\n",
    "    one_hot = sess.run(one_hot_matrix)\n",
    "    \n",
    "    # Close the session (approx. 1 line). See method 1 above.\n",
    "    sess.close()\n",
    "    \n",
    "    ### END CODE HERE ###\n",
    "    \n",
    "    return one_hot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "one_hot = [[ 0.  0.  0.  1.  0.  0.]\n",
      " [ 1.  0.  0.  0.  0.  1.]\n",
      " [ 0.  1.  0.  0.  1.  0.]\n",
      " [ 0.  0.  1.  0.  0.  0.]]\n"
     ]
    }
   ],
   "source": [
    "labels = np.array([1,2,3,0,2,1])\n",
    "one_hot = one_hot_matrix(labels, C = 4)\n",
    "print (\"one_hot = \" + str(one_hot))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Expected Output**: \n",
    "\n",
    "<table> \n",
    "    <tr> \n",
    "        <td>\n",
    "            **one_hot**\n",
    "        </td>\n",
    "        <td>\n",
    "        [[ 0.  0.  0.  1.  0.  0.]\n",
    " [ 1.  0.  0.  0.  0.  1.]\n",
    " [ 0.  1.  0.  0.  1.  0.]\n",
    " [ 0.  0.  1.  0.  0.  0.]]\n",
    "        </td>\n",
    "    </tr>\n",
    "\n",
    "</table>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.5 - Initialize with zeros and ones\n",
    "\n",
    "现在您将学习如何用 0 和 1 去初始化一个向量. The function you will be calling is `tf.ones()`. To initialize with zeros you could use tf.zeros() instead. These functions take in a shape and return an array of dimension shape full of zeros and ones respectively. \n",
    "\n",
    "**Exercise:** Implement the function below to take in a shape and to return an array (of the shape's dimension of ones). \n",
    "\n",
    " - tf.ones(shape)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: ones\n",
    "\n",
    "def ones(shape):\n",
    "    \"\"\"\n",
    "    Creates an array of ones of dimension shape\n",
    "    \n",
    "    Arguments:\n",
    "    shape -- shape of the array you want to create\n",
    "        \n",
    "    Returns: \n",
    "    ones -- array containing only ones\n",
    "    \"\"\"\n",
    "    \n",
    "    ### START CODE HERE ###\n",
    "    \n",
    "    # Create \"ones\" tensor using tf.ones(...). (approx. 1 line)\n",
    "    ones = tf.ones(shape)\n",
    "    \n",
    "    # Create the session (approx. 1 line)\n",
    "    sess = tf.Session()\n",
    "    \n",
    "    # Run the session to compute 'ones' (approx. 1 line)\n",
    "    ones = sess.run(ones)\n",
    "    \n",
    "    # Close the session (approx. 1 line). See method 1 above.\n",
    "    sess.close()\n",
    "    \n",
    "    ### END CODE HERE ###\n",
    "    return ones"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ones = [ 1.  1.  1.]\n"
     ]
    }
   ],
   "source": [
    "print (\"ones = \" + str(ones([3])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Expected Output:**\n",
    "\n",
    "<table> \n",
    "    <tr> \n",
    "        <td>\n",
    "            **ones**\n",
    "        </td>\n",
    "        <td>\n",
    "        [ 1.  1.  1.]\n",
    "        </td>\n",
    "    </tr>\n",
    "\n",
    "</table>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2 - 用tensorflow构建第一个神经网络\n",
    "\n",
    "在这个任务中，您将使用tensorflow建立一个神经网络. 通常有如下两步：\n",
    "\n",
    "- Create the computation graph\n",
    "- Run the graph\n",
    "\n",
    "### 2.0 - Problem statement: SIGNS Dataset\n",
    "\n",
    "某天下午，你和一群朋友一起决定去教电脑破解手语。你们花了几个小时在白墙前拍摄照片，然后得到了下面的数据集。现在你的工作是建立一个算法，以促进语言障碍人士和不懂手语的人进行沟通。\n",
    "\n",
    "- **训练集**: 1080个图片（64×64像素），代表从0到5的数字（每个数字180张图片）.\n",
    "- **测试集**: 120张图片（64×64像素），代表从0到5的数字（每张数20张）.\n",
    "\n",
    "请注意，这是SIGNS数据集的一个子集。完整的数据集包含更多的signs。\n",
    "\n",
    "以下是每个数字的示例，以及如何用标签表示示例。 These are the original pictures, before we lowered the image resolutoion to 64 by 64 pixels.\n",
    "<img src=\"images/hands.png\" style=\"width:800px;height:350px;\"><caption><center> <u><font color='purple'> **Figure 1**</u><font color='purple'>: SIGNS dataset <br> <font color='black'> </center>\n",
    "\n",
    "\n",
    "运行以下代码以加载数据集"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "AttributeError",
     "evalue": "module 'h5py' has no attribute 'File'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mAttributeError\u001b[0m                            Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-74-798ad2cbab29>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[1;31m# Loading the dataset\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mtrain_dataset\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mh5py\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mFile\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'1212.mp'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m\"r\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m      3\u001b[0m \u001b[0mX_train_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mY_train_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mX_test_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mY_test_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mclasses\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mload_dataset\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mAttributeError\u001b[0m: module 'h5py' has no attribute 'File'"
     ]
    }
   ],
   "source": [
    "# Loading the dataset\n",
    "X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "更改下面的索引并运行单元格去可视化数据集中的一些样例。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Example of a picture\n",
    "index = 0\n",
    "plt.imshow(X_train_orig[index])\n",
    "print (\"y = \" + str(np.squeeze(Y_train_orig[:, index])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "和往常一样，将图像数据集展开变平，然后除以255进行归一化。最重要的是，你要像图一所示，把每个标签转换为一个热点向量(a one-hot vector). 运行下面的单元格来执行此操作"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Flatten the training and test images\n",
    "X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T\n",
    "X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0], -1).T\n",
    "# Normalize image vectors\n",
    "X_train = X_train_flatten/255.\n",
    "X_test = X_test_flatten/255.\n",
    "# Convert training and test labels to one hot matrices\n",
    "Y_train = convert_to_one_hot(Y_train_orig, 6)\n",
    "Y_test = convert_to_one_hot(Y_test_orig, 6)\n",
    "\n",
    "print (\"number of training examples = \" + str(X_train.shape[1]))\n",
    "print (\"number of test examples = \" + str(X_test.shape[1]))\n",
    "print (\"X_train shape: \" + str(X_train.shape))\n",
    "print (\"Y_train shape: \" + str(Y_train.shape))\n",
    "print (\"X_test shape: \" + str(X_test.shape))\n",
    "print (\"Y_test shape: \" + str(Y_test.shape))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**请注意** that 12288 comes from $64 \\times 64 \\times 3$. 每个图像是64* 64 像素的正方形 , 3 是 RGB 颜色"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**您的目标**是建立一个能够高准确度识别符号的算法。要做到这一点，你要建立一个tensorflow模型，这个模型和你之前用 python 实现的猫咪识别模型几乎一样(不同的是现在使用softmax输出). 这是一个很好的场合去对比两者实现的差异（numpy和tensorflow）。\n",
    "\n",
    "**The model** is *LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX*. SIGMOID输出层已被转换为SOFTMAX. A SOFTMAX layer generalizes SIGMOID to when there are more than two classes. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1 - Create placeholders\n",
    "\n",
    "你的第一个任务是为`X` 和 `Y`创建占位符. 这将允许您稍后在运行会话时传递您的训练数据\n",
    "\n",
    "**Exercise:** 执行下面的函数去创建tensorflow中的占位符"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: create_placeholders\n",
    "\n",
    "def create_placeholders(n_x, n_y):\n",
    "    \"\"\"\n",
    "    Creates the placeholders for the tensorflow session.\n",
    "    \n",
    "    Arguments:\n",
    "    n_x -- scalar, size of an image vector (num_px * num_px = 64 * 64 * 3 = 12288)\n",
    "    n_y -- scalar, number of classes (from 0 to 5, so -> 6)\n",
    "    \n",
    "    Returns:\n",
    "    X -- placeholder for the data input, of shape [n_x, None] and dtype \"float\"\n",
    "    Y -- placeholder for the input labels, of shape [n_y, None] and dtype \"float\"\n",
    "    \n",
    "    Tips:\n",
    "    - You will use None because it let's us be flexible on the number of examples you will for the placeholders.\n",
    "      In fact, the number of examples during test/train is different.\n",
    "    \"\"\"\n",
    "\n",
    "    ### START CODE HERE ### (approx. 2 lines)\n",
    "    X = tf.placeholder(tf.float32, shape=(n_x,None), name=\"X\")\n",
    "    Y = tf.placeholder(tf.float32, shape=(n_y,None), name=\"Y\")\n",
    "    ### END CODE HERE ###\n",
    "    \n",
    "    return X, Y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "X, Y = create_placeholders(12288, 6)\n",
    "print (\"X = \" + str(X))\n",
    "print (\"Y = \" + str(Y))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Expected Output**: \n",
    "\n",
    "<table> \n",
    "    <tr> \n",
    "        <td>\n",
    "            **X**\n",
    "        </td>\n",
    "        <td>\n",
    "        Tensor(\"Placeholder_1:0\", shape=(12288, ?), dtype=float32) (not necessarily Placeholder_1)\n",
    "        </td>\n",
    "    </tr>\n",
    "    <tr> \n",
    "        <td>\n",
    "            **Y**\n",
    "        </td>\n",
    "        <td>\n",
    "        Tensor(\"Placeholder_2:0\", shape=(10, ?), dtype=float32) (not necessarily Placeholder_2)\n",
    "        </td>\n",
    "    </tr>\n",
    "\n",
    "</table>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.2 - Initializing the parameters\n",
    "\n",
    "你的第二个任务是初始化tensorflow的参数。\n",
    "\n",
    "**Exercise:** 执行下面的函数来初始化tensorflow中的参数。You are going use Xavier Initialization for weights and Zero Initialization for biases. The shapes are given below. As an example, to help you, for W1 and b1 you could use: \n",
    "\n",
    "```python\n",
    "W1 = tf.get_variable(\"W1\", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n",
    "b1 = tf.get_variable(\"b1\", [25,1], initializer = tf.zeros_initializer())\n",
    "```\n",
    "请使用 `seed = 1` ，以确保您的结果和我们的相匹配。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: initialize_parameters\n",
    "\n",
    "def initialize_parameters():\n",
    "    \"\"\"\n",
    "    Initializes parameters to build a neural network with tensorflow. The shapes are:\n",
    "                        W1 : [25, 12288]\n",
    "                        b1 : [25, 1]\n",
    "                        W2 : [12, 25]\n",
    "                        b2 : [12, 1]\n",
    "                        W3 : [6, 12]\n",
    "                        b3 : [6, 1]\n",
    "    \n",
    "    Returns:\n",
    "    parameters -- a dictionary of tensors containing W1, b1, W2, b2, W3, b3\n",
    "    \"\"\"\n",
    "    \n",
    "    tf.set_random_seed(1)                   # so that your \"random\" numbers match ours\n",
    "        \n",
    "    ### START CODE HERE ### (approx. 6 lines of code)\n",
    "    W1 = tf.get_variable(\"W1\", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n",
    "    b1 = tf.get_variable(\"b1\", [25,1], initializer = tf.zeros_initializer())\n",
    "    W2 = tf.get_variable(\"W2\", [12,25], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n",
    "    b2 = tf.get_variable(\"b2\", [12,1], initializer = tf.zeros_initializer())\n",
    "    W3 = tf.get_variable(\"W3\", [6,12], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n",
    "    b3 = tf.get_variable(\"b3\", [6,1], initializer = tf.zeros_initializer())\n",
    "    ### END CODE HERE ###\n",
    "\n",
    "    parameters = {\"W1\": W1,\n",
    "                  \"b1\": b1,\n",
    "                  \"W2\": W2,\n",
    "                  \"b2\": b2,\n",
    "                  \"W3\": W3,\n",
    "                  \"b3\": b3}\n",
    "    \n",
    "    return parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "tf.reset_default_graph()\n",
    "with tf.Session() as sess:\n",
    "    parameters = initialize_parameters()\n",
    "    print(\"W1 = \" + str(parameters[\"W1\"]))\n",
    "    print(\"b1 = \" + str(parameters[\"b1\"]))\n",
    "    print(\"W2 = \" + str(parameters[\"W2\"]))\n",
    "    print(\"b2 = \" + str(parameters[\"b2\"]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Expected Output**: \n",
    "\n",
    "<table> \n",
    "    <tr> \n",
    "        <td>\n",
    "            **W1**\n",
    "        </td>\n",
    "        <td>\n",
    "         < tf.Variable 'W1:0' shape=(25, 12288) dtype=float32_ref >\n",
    "        </td>\n",
    "    </tr>\n",
    "    <tr> \n",
    "        <td>\n",
    "            **b1**\n",
    "        </td>\n",
    "        <td>\n",
    "        < tf.Variable 'b1:0' shape=(25, 1) dtype=float32_ref >\n",
    "        </td>\n",
    "    </tr>\n",
    "    <tr> \n",
    "        <td>\n",
    "            **W2**\n",
    "        </td>\n",
    "        <td>\n",
    "        < tf.Variable 'W2:0' shape=(12, 25) dtype=float32_ref >\n",
    "        </td>\n",
    "    </tr>\n",
    "    <tr> \n",
    "        <td>\n",
    "            **b2**\n",
    "        </td>\n",
    "        <td>\n",
    "        < tf.Variable 'b2:0' shape=(12, 1) dtype=float32_ref >\n",
    "        </td>\n",
    "    </tr>\n",
    "\n",
    "</table>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As expected, the parameters haven't been evaluated yet."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.3 - tensorflow 的前向传播\n",
    "\n",
    "您现在将实现tensorflow中的前向传播模块。该函数将带有一个参数字典，它将完成前向传递。您将使用的功能是：\n",
    "\n",
    "- `tf.add(...,...)` to do an addition\n",
    "- `tf.matmul(...,...)` to do a matrix multiplication\n",
    "- `tf.nn.relu(...)` to apply the ReLU activation\n",
    "\n",
    "**Question:** 实现神经网络的前向传递. We commented for you the numpy equivalents so that you can compare the tensorflow implementation to numpy. It is important to note that the forward propagation stops at `z3`. The reason is that in tensorflow the last linear layer output is given as input to the function computing the loss. Therefore, you don't need `a3`!\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: forward_propagation\n",
    "\n",
    "def forward_propagation(X, parameters):\n",
    "    \"\"\"\n",
    "    Implements the forward propagation for the model: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX\n",
    "    \n",
    "    Arguments:\n",
    "    X -- input dataset placeholder, of shape (input size, number of examples)\n",
    "    parameters -- python dictionary containing your parameters \"W1\", \"b1\", \"W2\", \"b2\", \"W3\", \"b3\"\n",
    "                  the shapes are given in initialize_parameters\n",
    "\n",
    "    Returns:\n",
    "    Z3 -- the output of the last LINEAR unit\n",
    "    \"\"\"\n",
    "    \n",
    "    # Retrieve the parameters from the dictionary \"parameters\" \n",
    "    W1 = parameters['W1']\n",
    "    b1 = parameters['b1']\n",
    "    W2 = parameters['W2']\n",
    "    b2 = parameters['b2']\n",
    "    W3 = parameters['W3']\n",
    "    b3 = parameters['b3']\n",
    "    \n",
    "    ### START CODE HERE ### (approx. 5 lines)              # Numpy Equivalents:\n",
    "    Z1 = tf.matmul(W1, X)+b1                               # Z1 = np.dot(W1, X) + b1\n",
    "    A1 = tf.nn.relu(Z1)                                    # A1 = relu(Z1)\n",
    "    Z2 = tf.matmul(W2, A1)+b2                              # Z2 = np.dot(W2, a1) + b2\n",
    "    A2 = tf.nn.relu(Z2)                                    # A2 = relu(Z2)\n",
    "    Z3 = tf.matmul(W3, A2)+b3                              # Z3 = np.dot(W3,Z2) + b3\n",
    "    ### END CODE HERE ###\n",
    "    \n",
    "    return Z3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "tf.reset_default_graph()\n",
    "\n",
    "with tf.Session() as sess:\n",
    "    X, Y = create_placeholders(12288, 6)\n",
    "    parameters = initialize_parameters()\n",
    "    Z3 = forward_propagation(X, parameters)\n",
    "    print(\"Z3 = \" + str(Z3))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Expected Output**: \n",
    "\n",
    "<table> \n",
    "    <tr> \n",
    "        <td>\n",
    "            **Z3**\n",
    "        </td>\n",
    "        <td>\n",
    "        Tensor(\"Add_2:0\", shape=(6, ?), dtype=float32)\n",
    "        </td>\n",
    "    </tr>\n",
    "\n",
    "</table>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "您可能已经注意到，前向传播不会输出任何缓存. You will understand why below, when we get to brackpropagation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.4 Compute cost\n",
    "\n",
    "就像之前看到的，我们使用下面的方法计算成本非常简单：\n",
    "```python\n",
    "tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = ..., labels = ...))\n",
    "```\n",
    "**Question**: 实现下面的 cost function. \n",
    "- It is important to know that the \"`logits`\" and \"`labels`\" inputs of `tf.nn.softmax_cross_entropy_with_logits` are expected to be of shape (number of examples, num_classes). We have thus transposed Z3 and Y for you.\n",
    "- Besides, `tf.reduce_mean` basically does the summation over the examples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# GRADED FUNCTION: compute_cost \n",
    "\n",
    "def compute_cost(Z3, Y):\n",
    "    \"\"\"\n",
    "    Computes the cost\n",
    "    \n",
    "    Arguments:\n",
    "    Z3 -- output of forward propagation (output of the last LINEAR unit), of shape (6, number of examples)\n",
    "    Y -- \"true\" labels vector placeholder, same shape as Z3\n",
    "    \n",
    "    Returns:\n",
    "    cost - Tensor of the cost function\n",
    "    \"\"\"\n",
    "    \n",
    "    # to fit the tensorflow requirement for tf.nn.softmax_cross_entropy_with_logits(...,...)\n",
    "    logits = tf.transpose(Z3)\n",
    "    labels = tf.transpose(Y)\n",
    "    \n",
    "    ### START CODE HERE ### (1 line of code)\n",
    "    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = labels))\n",
    "    ### END CODE HERE ###\n",
    "    \n",
    "    return cost"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "tf.reset_default_graph()\n",
    "\n",
    "with tf.Session() as sess:\n",
    "    X, Y = create_placeholders(12288, 6)\n",
    "    parameters = initialize_parameters()\n",
    "    Z3 = forward_propagation(X, parameters)\n",
    "    cost = compute_cost(Z3, Y)\n",
    "    print(\"cost = \" + str(cost))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Expected Output**: \n",
    "\n",
    "<table> \n",
    "    <tr> \n",
    "        <td>\n",
    "            **cost**\n",
    "        </td>\n",
    "        <td>\n",
    "        Tensor(\"Mean:0\", shape=(), dtype=float32)\n",
    "        </td>\n",
    "    </tr>\n",
    "\n",
    "</table>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.5 - Backward propagation & parameter updates\n",
    "\n",
    "所有的反向传播和参数更新都在1行代码中处理。\n",
    "\n",
    "计算成本函数后。你将创建一个\"`optimizer`\"对象. 在运行tf.session时，你必须调用这个对象和成本函数. 当被调用时，它将根据所选择的方法和学习率对给定的成本进行优化。\n",
    "\n",
    "例如，对于梯度下降，优化器将是：\n",
    "```python\n",
    "optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(cost)\n",
    "```\n",
    "\n",
    "要进行优化，您可以这样做：\n",
    "```python\n",
    "_ , c = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})\n",
    "```\n",
    "\n",
    "This computes the backpropagation by passing through the tensorflow graph in the reverse order. From cost to inputs.\n",
    "\n",
    "**Note** 在编码时，我们经常使用“一次性”变量 下滑线 `_` 来存储我们稍后不需要使用的值。 `_` takes on the evaluated value of `optimizer`, which we don't need (and `c` takes the value of the `cost` variable). "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.6 - Building the model\n",
    "\n",
    "Now, you will bring it all together! \n",
    "\n",
    "**Exercise:** 完成模型。你将会调用你之前已经实现的功能。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,\n",
    "          num_epochs = 1500, minibatch_size = 32, print_cost = True):\n",
    "    \"\"\"\n",
    "    Implements a three-layer tensorflow neural network: LINEAR->RELU->LINEAR->RELU->LINEAR->SOFTMAX.\n",
    "    \n",
    "    Arguments:\n",
    "    X_train -- training set, of shape (input size = 12288, number of training examples = 1080)\n",
    "    Y_train -- test set, of shape (output size = 6, number of training examples = 1080)\n",
    "    X_test -- training set, of shape (input size = 12288, number of training examples = 120)\n",
    "    Y_test -- test set, of shape (output size = 6, number of test examples = 120)\n",
    "    learning_rate -- learning rate of the optimization\n",
    "    num_epochs -- number of epochs of the optimization loop\n",
    "    minibatch_size -- size of a minibatch\n",
    "    print_cost -- True to print the cost every 100 epochs\n",
    "    \n",
    "    Returns:\n",
    "    parameters -- parameters learnt by the model. They can then be used to predict.\n",
    "    \"\"\"\n",
    "    \n",
    "    ops.reset_default_graph()                         # to be able to rerun the model without overwriting tf variables\n",
    "    tf.set_random_seed(1)                             # to keep consistent results\n",
    "    seed = 3                                          # to keep consistent results\n",
    "    (n_x, m) = X_train.shape                          # (n_x: input size, m : number of examples in the train set)\n",
    "    n_y = Y_train.shape[0]                            # n_y : output size\n",
    "    costs = []                                        # To keep track of the cost\n",
    "    \n",
    "    # Create Placeholders of shape (n_x, n_y)\n",
    "    ### START CODE HERE ### (1 line)\n",
    "    X, Y = create_placeholders(n_x, n_y)\n",
    "    ### END CODE HERE ###\n",
    "\n",
    "    # Initialize parameters\n",
    "    ### START CODE HERE ### (1 line)\n",
    "    parameters = initialize_parameters()\n",
    "    ### END CODE HERE ###\n",
    "    \n",
    "    # Forward propagation: Build the forward propagation in the tensorflow graph\n",
    "    ### START CODE HERE ### (1 line)\n",
    "    Z3 = forward_propagation(X, parameters)\n",
    "    ### END CODE HERE ###\n",
    "    \n",
    "    # Cost function: Add cost function to tensorflow graph\n",
    "    ### START CODE HERE ### (1 line)\n",
    "    cost = compute_cost(Z3, Y)\n",
    "    ### END CODE HERE ###\n",
    "    \n",
    "    # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer.\n",
    "    ### START CODE HERE ### (1 line)\n",
    "    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)\n",
    "    ### END CODE HERE ###\n",
    "    \n",
    "    # Initialize all the variables\n",
    "    init = tf.global_variables_initializer()\n",
    "\n",
    "    # Start the session to compute the tensorflow graph\n",
    "    with tf.Session() as sess:\n",
    "        \n",
    "        # Run the initialization\n",
    "        sess.run(init)\n",
    "        \n",
    "        # Do the training loop\n",
    "        for epoch in range(num_epochs):\n",
    "\n",
    "            epoch_cost = 0.                       # Defines a cost related to an epoch\n",
    "            num_minibatches = int(m / minibatch_size) # number of minibatches of size minibatch_size in the train set\n",
    "            seed = seed + 1\n",
    "            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)\n",
    "\n",
    "            for minibatch in minibatches:\n",
    "\n",
    "                # Select a minibatch\n",
    "                (minibatch_X, minibatch_Y) = minibatch\n",
    "                \n",
    "                # IMPORTANT: The line that runs the graph on a minibatch.\n",
    "                # Run the session to execute the \"optimizer\" and the \"cost\", the feedict should contain a minibatch for (X,Y).\n",
    "                ### START CODE HERE ### (1 line)\n",
    "                _ , minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})\n",
    "                ### END CODE HERE ###\n",
    "                \n",
    "                epoch_cost += minibatch_cost / num_minibatches\n",
    "\n",
    "            # Print the cost every epoch\n",
    "            if print_cost == True and epoch % 100 == 0:\n",
    "                print (\"Cost after epoch %i: %f\" % (epoch, epoch_cost))\n",
    "            if print_cost == True and epoch % 5 == 0:\n",
    "                costs.append(epoch_cost)\n",
    "                \n",
    "        # plot the cost\n",
    "        plt.plot(np.squeeze(costs))\n",
    "        plt.ylabel('cost')\n",
    "        plt.xlabel('iterations (per tens)')\n",
    "        plt.title(\"Learning rate =\" + str(learning_rate))\n",
    "        plt.show()\n",
    "\n",
    "        # lets save the parameters in a variable\n",
    "        parameters = sess.run(parameters)\n",
    "        print (\"Parameters have been trained!\")\n",
    "\n",
    "        # Calculate the correct predictions\n",
    "        correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))\n",
    "\n",
    "        # Calculate accuracy on the test set\n",
    "        accuracy = tf.reduce_mean(tf.cast(correct_prediction, \"float\"))\n",
    "\n",
    "        print (\"Train Accuracy:\", accuracy.eval({X: X_train, Y: Y_train}))\n",
    "        print (\"Test Accuracy:\", accuracy.eval({X: X_test, Y: Y_test}))\n",
    "        \n",
    "        return parameters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "运行下面的单元格来训练你的模型！在我们的机器上大约需要5分钟. Your \"Cost after epoch 100\" should be 1.016458. 如果不是，不要浪费时间; 通过点击 notebook 上方的方块 (⬛)去中断训练, 并尝试更正您的代码. 如果损失是正确的，请休息一下，5分钟后回来！"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "parameters = model(X_train, Y_train, X_test, Y_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Expected Output**:\n",
    "\n",
    "<table> \n",
    "    <tr> \n",
    "        <td>\n",
    "            **Train Accuracy**\n",
    "        </td>\n",
    "        <td>\n",
    "        0.999074\n",
    "        </td>\n",
    "    </tr>\n",
    "    <tr> \n",
    "        <td>\n",
    "            **Test Accuracy**\n",
    "        </td>\n",
    "        <td>\n",
    "        0.716667\n",
    "        </td>\n",
    "    </tr>\n",
    "\n",
    "</table>\n",
    "\n",
    "太令人惊讶了，您的算法可以识别代表0到5之间的数字的符号，准确率为71.7％。\n",
    "\n",
    "**Insights**:\n",
    "- 你的模型似乎足够大去拟合训练集。但是，考虑到训练集和测试集精确度之间的差异，您可以尝试添加 L2 或 dropout 正则化去减少过拟合。\n",
    "- 把会话视为一组训练模型的代码。每次在小批量集上运行会话时，都会训练参数。In total you have run the session a large number of times (1500 epochs) until you obtained well trained parameters."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.7 - Test with your own image (optional / ungraded exercise)\n",
    "\n",
    "祝贺您完成这项任务。您现在可以拍摄您的手的照片，并查看模型的输出：\n",
    "    1. Click on \"File\" in the upper bar of this notebook, then click \"Open\" to go on your Coursera Hub.\n",
    "    2. Add your image to this Jupyter Notebook's directory, in the \"images\" folder\n",
    "    3. Write your image's name in the following code\n",
    "    4. Run the code and check if the algorithm is right!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "import scipy\n",
    "from PIL import Image\n",
    "from scipy import ndimage\n",
    "\n",
    "## START CODE HERE ## (PUT YOUR IMAGE NAME) \n",
    "my_image = \"thumbs_up.jpg\"\n",
    "## END CODE HERE ##\n",
    "\n",
    "# We preprocess your image to fit your algorithm.\n",
    "fname = \"images/\" + my_image\n",
    "image = np.array(ndimage.imread(fname, flatten=False))\n",
    "my_image = scipy.misc.imresize(image, size=(64,64)).reshape((1, 64*64*3)).T\n",
    "my_image_prediction = predict(my_image, parameters)\n",
    "\n",
    "plt.imshow(image)\n",
    "print(\"Your algorithm predicts: y = \" + str(np.squeeze(my_image_prediction)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You indeed deserved a \"thumbs-up\" although as you can see the algorithm seems to classify it incorrectly. 原因是训练集不包含任何“竖起大拇指”的图像，所以模型不知道该如何处理!我们称这个为 \"mismatched data distribution\"， 是下一个课程 \"Structuring Machine Learning Projects\" 要讲解的内容之一."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "<font color='blue'>\n",
    "**What you should remember**:\n",
    "- 在深度学习中 Tensorflow 是一个编程框架\n",
    "- tensorflow 有两个很重要的对象类 Tensors 和 Operators. \n",
    "- 当你用 tensorflow 编码时，你的编码步骤必须是：\n",
    "    - 创建一个包含 Tensors (Variables, Placeholders ...) 和 Operations (tf.matmul, tf.add, ...)的计算图\n",
    "    - 创建一个会话\n",
    "    - 初始化会话\n",
    "    - 运行会话以执行计算图\n",
    "- 您能多次执行计算图就像你在 model（）中看到的那样\n",
    "- 当运行会话里的 \"optimizer\" 对象时，反向传播和优化会自动完成。"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "coursera": {
   "course_slug": "deep-neural-network",
   "graded_item_id": "BFd89",
   "launcher_item_id": "AH2rK"
  },
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
